Transcriptome pipeline steps and reports
########################################

Here, we outline the pipeline steps and reports for the Transcriptome pipeline, which encompasses the RNA-seq, RNA-seq with UMI, MARS-seq, and DESeq2 from counts matrix pipelines.

Analysis pipeline steps
-----------------------

The pipeline:

1. Trims adapter sequences
2. Runs FastQC on the trimmed sequences for quality control of the samples, in parallel with the steps that follow
3. Maps reads to the selected reference genome
4. Adds UMI and gene information to the reads
5. Quantifies gene expression by counting reads
6. Counts UMIs to account for PCR bias
7. Detects Differentially Expressed (DE) genes for a model with a single factor

Steps 4 and 6 are performed only for MARS-seq and RNA-seq with UMI.

Step 7 is performed only if DESeq2 is selected.

Steps 1-6 are not performed for the DESeq2 from counts matrix pipeline.

.. image:: ../../figures/rna-seq_workflow.jpg

Pipeline report
---------------

Upon completion of the analysis, you will be sent an email with links to the results report. The report includes several sections:

1. Sequencing and Mapping QC

   a. `Figure 1 `_ - Plots the average quality of each base across all reads. Qualities of 30 (predicted error rate 1:1000) and above are good
   b. `Figure 2 `_ - Histogram showing the number of reads for each sample in the raw data
   c. Figure 3 - Histogram showing the percentage of reads discarded after trimming the adapters (after the adapters are removed, short, polyA/T and low-quality reads are discarded by the pipeline). No figure is presented when the percentage of reads discarded after trimming is lower than 1% for all samples
   d. `Figure 4 `_ - Histogram with the number of reads for each sample in each step of the pipeline
   e. `Figure 5 `_ - Plots sequence coverage on and near gene regions

2. Exploratory Analysis

   a. `Figure 6 `_ - Heatmap plotting the fraction of reads from the genes with the most counts
   b.
      `Figure 7 `_ - Heatmap of Pearson correlation between samples according to gene expression values
   c. `Figure 8 `_ - Clustering dendrogram of the samples according to gene expression
   d. Figure 9 - PCA analysis

      i. `Histogram of % explained variability for each PC component `_
      ii. `PCA plot of PC1 vs PC2 `_
      iii. `PCA plot of PC1 vs PC3 `_

3. `Differential Expression Analysis `_ (this section exists only if you run the DESeq2 analysis) - A table with the number of differentially expressed (DE) genes in each category (up/down) for the different contrasts. Links to p-value distributions, volcano plots and heatmaps are also provided, as well as a table of the DE genes with dot plots of their expression values.
4. `Bioinformatics Pipeline Methods `_ - Description of pipeline methods.
5. `Links to additional results `_ - Links for downloading tables with raw counts, normalized counts, log normalized values (rld), and statistical data of contrasts. For models with batches, "combat" values are calculated (instead of rld) using the "sva" package, providing batch-corrected normalized log2 count values.

Note that only Figure 2 from Section 1, as well as Sections 2–5, will appear in the DESeq2 from counts matrix report.

Output folders for RNA-seq pipeline
-----------------------------------

| 0_concatenating_fastq
| 1_cutadapt
| 2_fastqc
| 3_mapping
| 4_reports
| Log directory

Output folders for MARS-seq and RNA-seq with UMI pipelines
----------------------------------------------------------

| 1_combined_fastq
| 2_cutadapt
| 3_fastqc
| 4_mapping
| 5_move_umi
| 6_count_reads
| 7_mark_dup
| 8_dedup_counts
| 9_umi_counts
| 10_reports
| Log directory

Output folders for DESeq2 from counts matrix pipeline
-----------------------------------------------------

| Log file

Annotation file
---------------

To count the reads per gene, we use annotation files (GTF format) from "Ensembl" or "GENCODE". In MARS-seq analysis, we extend the 3' UTR exon away from the transcript on the DNA, and extend or cut the 3' UTR exon towards the 5' direction on the mRNA.

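To make the strand logic of the 3' UTR adjustment concrete, here is a minimal Python sketch. It is not the pipeline's actual code: the ``Exon`` type, the ``adjust_three_prime_exon`` function, and the ``downstream`` and ``upstream`` parameters are hypothetical names introduced for illustration, and the base counts used in the usage lines are arbitrary. It assumes GTF-style 1-based inclusive coordinates.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    """Hypothetical container for a 3' UTR exon (GTF-style, 1-based inclusive)."""
    start: int
    end: int
    strand: str  # "+" or "-"


def adjust_three_prime_exon(exon: Exon, downstream: int = 0, upstream: int = 0) -> Exon:
    """Return a copy of `exon` with its edges moved in a strand-aware way.

    downstream: bases added past the 3' end, away from the transcript.
    upstream:   bases added at the exon's 5' edge; a negative value cuts the exon.
    """
    if exon.strand == "+":
        # On the plus strand the 3' end is the larger coordinate.
        start, end = exon.start - upstream, exon.end + downstream
    elif exon.strand == "-":
        # On the minus strand the 3' end is the smaller coordinate.
        start, end = exon.start - downstream, exon.end + upstream
    else:
        raise ValueError(f"unknown strand: {exon.strand!r}")
    start = max(1, start)  # coordinates cannot go below the first base
    if start > end:
        raise ValueError("adjustment removed the whole exon")
    return Exon(start, end, exon.strand)


# Arbitrary illustrative values, not the pipeline's settings.
plus_exon = adjust_three_prime_exon(Exon(100, 200, "+"), downstream=50, upstream=-20)
minus_exon = adjust_three_prime_exon(Exon(100, 200, "-"), downstream=50, upstream=10)
```

With these inputs, ``plus_exon`` spans 120 to 250 (cut by 20 bases at its 5' edge, extended by 50 past its 3' end) and ``minus_exon`` spans 50 to 210. The point of the sketch is only that "away from the transcript" means a larger end coordinate on the plus strand and a smaller start coordinate on the minus strand.
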
Examples of reports
-------------------

| `RNA-seq example `_
| `MARS-seq example `_
| `RNA-seq with UMI example `_
| `DESeq2 from counts matrix example `_

Note: These example analyses demonstrate a good starting point, and not necessarily an end result.
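
For readers comparing the read counts and UMI counts in the MARS-seq and RNA-seq with UMI reports, the following Python sketch illustrates the idea behind steps 4-6 of the analysis pipeline. It is a simplified illustration under stated assumptions, not the pipeline's implementation: the ``count_reads_and_umis`` function and the ``(gene, umi)`` input format are hypothetical, and it assumes that reads sharing a gene and an identical UMI sequence are PCR duplicates of one original molecule.

```python
from collections import Counter, defaultdict


def count_reads_and_umis(tagged_reads):
    """Count reads and distinct UMIs per gene.

    tagged_reads: iterable of (gene, umi) pairs, one pair per mapped read
    (hypothetical format standing in for reads tagged in step 4).
    """
    read_counts = Counter()
    umis_per_gene = defaultdict(set)
    for gene, umi in tagged_reads:
        read_counts[gene] += 1        # step 5: every read counts
        umis_per_gene[gene].add(umi)  # step 6: duplicates collapse in the set
    umi_counts = {gene: len(umis) for gene, umis in umis_per_gene.items()}
    return dict(read_counts), umi_counts


# Toy input: geneA has three reads, two of which share the same UMI.
reads = [("geneA", "ACGT"), ("geneA", "ACGT"), ("geneA", "TTTT"), ("geneB", "ACGT")]
read_counts, umi_counts = count_reads_and_umis(reads)
```

In this toy input, geneA has three reads but only two distinct UMIs, so its UMI count is lower than its read count; that gap is what the UMI counting step is meant to expose.
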